\documentclass[a4paper,11pt]{scrartcl}

\usepackage{fontspec}
\defaultfontfeatures{Ligatures=TeX}

\setromanfont{Sabon}
\setsansfont{Optima}
\usepackage{graphicx,url}
\usepackage{amssymb,amsmath}

\title{MetiTarski User Guide}
\date{21 October 2014}
\author{Lawrence C Paulson \\ University of Cambridge}

\begin{document}
\maketitle
\section{Introduction}
  MetiTarski is an automatic theorem prover based on a combination of resolution and a decision procedure  for the theory of real closed fields. It is designed to prove theorems involving real-valued special functions such as log, exp, sin, cos and sqrt.   In particular, it is designed to prove universally quantified inequalities involving such  functions. This problem is undecidable, so MetiTarski is necessarily incomplete. Nevertheless, MetiTarski is remarkably powerful. Here are a few of the hundreds of theorems that it can prove, automatically and in seconds.

% There are some substantial runtimes below, but with Mathematica they are much faster.
% These runtimes are for QEPCAD.
%%{\tt CONVOI2-sincos} 2.2 secs
\begin{multline*}
\forall\, t > 0, v > 0\\
((1.565 + 0.313\,v)\,\cos(1.16\,t) + (.0134 + .00268\,v)\,\sin(1.16\,t))\,e^{-1.34\,t}\\
- (6.55 + 1.31\,v)\,e^{-0.318\,t} + v \,\geq\, {-10}
\end{multline*}
%%{\tt exp-problem-9} 3.3 seconds;    2.1 seconds (RCF)
\[ \forall\, x > 0 \Longrightarrow \frac{1 - e^{-2\, x}}{2\,x\,(1 - e^{-x})^2} - \frac{1}{x^2} \leq \frac{1}{12}
\]
%%{\tt log-fun-ineq-e-weak} 227.3 seconds;  226.4 seconds (QEPCAD). 58.1 s;   57.6 s (M)
\[\forall\, x\, y,\ x \in (0, 12)\Longrightarrow
      x y \leq \frac{1}{5} + x\,\ln(x) + e^{y-1}
\]
%%{\tt max-sin-2} 49.5 seconds;   44.4 seconds (RCF),
\[\forall\,x \in (-8,5)\Longrightarrow \max(\sin(x), \sin(x + 4), \cos(x)) > 0
\]
%%{\tt sin-3425b} 115.4 seconds;  115.1 seconds (RCF)
\[\forall x\, y ,\  (0<x < y \land y^2 < 6) \Longrightarrow \frac{\sin(y)}{\sin(x)} \leq 10^{-4} + \frac{y - \frac{1}{6}y^3 + \frac{1}{120}y^5}{x - \frac{1}{6}x^3 + \frac{1}{120}x^5}
\]
%%{\tt sqrt-problem-13-sqrt3} 9.6 seconds;    7.0 seconds (RCF)
\[\forall x \in (0, 1)\Longrightarrow 1.914 \frac{\sqrt{1+x} - \sqrt{1-x}}{4 + \sqrt{1+x} + \sqrt{1-x}} \leq 0.01 + \frac{x}{2+\sqrt{1-x^2}}
\]
%%{\tt tan-1-1var-weak}  7.2 seconds;    5.2 seconds (RCF)
\[\forall x \in (0, 1.25) \Longrightarrow \tan(x)^2 \leq 1.75\, 10^{-7} + \tan(1)\tan(x^2)
\]
%%{\tt trig-squared3} 421.7 seconds;  419.1 seconds (RCF)
\[\forall x \in (-1, 1), y \in (-1, 1) \Longrightarrow \cos(x)^2 - \cos(y)^2 \leq -\sin(x+y)\sin(x-y) + 0.25
\]
%%{\tt trig-squared4} 484.2 seconds;  481.4 seconds (RCF)
\[\forall x \in (-1, 1), y \in (-1, 1) \Longrightarrow \cos(x)^2 - \cos(y)^2 \geq -\sin(x+y)\sin(x-y) - 0.25
\]
%%{\tt trigpoly-3514-2} 10.2 seconds;    9.3 seconds (RCF)
\[\forall x \in (-\pi, \pi) \Longrightarrow 2\,|\sin(x)| + |\sin(2\,x)| \leq \frac{9}{\pi}
\]
%

Work on MetiTarski started in 2006, jointly with Behzad Akbarpour. The original manual experiments \cite{akbarpour-towards}
were later automated with the help of a basic decision procedure~\cite{akbarpour-extending}.
Crucial is the use of upper and lower bounds for the special functions of interest.
By 2008, MetiTarski had a basic set of such bounds, largely based on Taylor expansions, and using QEPCAD as its decision procedure \cite{akbarpour2008}.
The next major milestone was the introduction of upper and lower bounds based on continued fractions \cite{metitarski-jar}.
Until this stage, MetiTarski's underlying proof engine was based on standard resolution. The introduction of case-splitting (originally in simulated form, later superseded by James Bridge's implementation of backtracking) provided decisive additional power for many problems \cite{bridge-case}.
Other work, chiefly by Grant Passmore, extended the range of decision procedures to include Mathematica and Z3; later, Z3 with a specialised ``strategy~1'' became the default configuration~\cite{passmore-real}.
Passmore implemented an interval constraint solver and many other refinements.

The development of MetiTarski went hand-in-hand with a variety of applications, chiefly at Concordia University (in Montréal) and Cambridge.
These applications chiefly concerned hybrid systems \cite{akbarpour-hscc} and analogue circuits \cite{denman-analog,narayanan-formal}.
William Denman has continued to explore applications mostly in the aerospace domain
\cite{denman-quantum,denman-pvs,denman-flight}.

For definitive and comprehensive about MetiTarski, consult the journal articles \cite{bridge-case,metitarski-jar}.
For more casual overviews and speculations about future research, some of Paulson's invited lectures \cite{paulson-past,paulson-phase} are worth reading.
Links to most of these papers can be found on the MetiTarski web page, \url{www.cl.cam.ac.uk/~lp15/papers/Arith/}.

\section{Compiling and installing MetiTarski}

Before compiling and installing MetiTarski, first install Poly/ML (from \url{www.polyml.org})
and at least one decision procedure as outlined below.


\subsection{Installing a decision procedure}

MetiTarski requires an external algebraic decision method (EADM).
MetiTarski supports three such tools. For each, certain \textit{environment variables} must be set up to locate them.


\begin{description}\sloppy
\item[QEPCAD B version 1.69 or later]
is available from \url{http://www.usna.edu/CS/qepcadweb/B/QEPCAD.html} in source form.
The environment variable \texttt{qe} must point to the qesource subdirectory of the QEPCAD B installation, so that \texttt{\$qe/bin/qepcad} is the compiled QEPCAD B binary.


\item[Mathematica version 8.0 or later]
is a commercial product.
The environment variable \verb|MATH_KERNEL| must point to the binary for the console-mode Mathematica interface. This file is called
\texttt{math} on Linux and \texttt{MathKernel.exe} on Windows.
On Mac OS X, it must point into the Mathematica application itself:
\begin{verbatim}
/Applications/Mathematica.app/Contents/MacOS/MathKernel
\end{verbatim}

\begin{quote}
  \textit{Remark}: we have recently encountered the following error message when attempting to use Mathematica under Mac OS X:

\ \ \ \texttt{dlopen(\ldots/SystemFiles, 1): image not found}

A workaround is to create the missing file as a symbolic link to

\ \ \ \texttt{/Applications/Mathematica.app/SystemFiles} .
\end{quote}



\item[Z3 version 4.0 or later]
can be obtained from \url{z3.codeplex.com/}.
The environment variable \verb|Z3_NONLIN| must point to the Z3 binary.
\end{description}


The default is Z3, which is available in binary form and is free to non-commercial users.
QEPCAD is also free, but must be built from sources. QEPCAD is best for univariate
problems. Mathematica is available in many institutions under a site license, and is very good
for problems involving three or four variables.

\subsection{Building MetiTarski}

To build MetiTarski, first install Poly/ML version 5.5 or later.
For improved performance, you can build Poly/ML using the GMP multiple-precision arithmetic package: first install GMP (from \url{gmplib.org}) and then build Poly/ML including the line
\begin{verbatim}
./configure --with-gmp=yes
\end{verbatim}
%
You may need to set the variables \texttt{LDFLAGS}, \texttt{CFLAGS} and \texttt{CXXFLAGS} to locate \texttt{gmp}.

To compile and install MetiTarski in its default location of \texttt{/usr/local/bin}, enter the
following three commands:
\begin{verbatim}
./configure
make
sudo make install
\end{verbatim}

To install MetiTarski in a different directory, use the \texttt{--prefix} option with the \texttt{configure}
command. For example, to install it in subdirectory \texttt{bin} of your home directory, use
\begin{flushleft}\ttfamily
./configure --prefix=\$HOME
\end{flushleft}
%
NB the prefix should \textit{not} end with ``\texttt{bin}'', as this will be added automatically.

\section{Axiom and problem files}

Before preparing your own problems, please look at our many examples.
They will give you an impression of the sort of problem MetiTarski can solve.
MetiTarski can include relevant axiom files automatically (using the
option \texttt{--autoInclude}). Better results can be obtained if you select the axiom files
yourself, omitting those that aren't strictly necessary. But this requires
some experience and skill.

Directory \texttt{tptp} contains many axiom and problem files. The problem files use the
\texttt{include} directive to insert axioms relevant to the functions mentioned in
those problems. Here is an example:

\begin{verbatim}
include('Axioms/general.ax').
\end{verbatim}

The pathname mentioned by such a directive is looked up relative
to a base directory, which can be specified to MetiTarski using the \texttt{--tptp}
command line option. If that option is not used, the base directory is obtained
from the current setting of the \texttt{TPTP} environment variable.  However, if a
pathname starts with the ``\texttt{/}'' character, then it is regarded as absolute and no
base directory is used. If an included file is not found, then MetiTarski will
terminate with an error message.

To run MetiTarski on an individual problem, use the binary executable, metit.
here is an example:
\begin{verbatim}
metit --verbose 0 --show proof --tptp .. atan-problem-15.tptp
\end{verbatim}

Here we see three options being given. The first suppresses all prover output
apart from the final result.  The second requests the prover to display the final proof.
The default verbosity is 1, whilst 2 and 3 present more detail. The third option
specifies the TPTP directory.  The last item on the command line is the filename
containing the problem; you may put a series of file names here and the problems
will be attempted sequentially.

Version 1.4 introduced experimental, limited support for problems involving
existential quantification. In the supplied set of problems, the existential
ones have names beginning with the letter X. Suggestions for more interesting
problems are welcome! These cannot be solved using Z3 as the EADM.


\section{Defining your own functions}


Non-recursive functions (effectively, abbreviations) can be defined using axioms
such as the following:
\begin{verbatim}
cnf(f, axiom, (f(X) = 2*(X - X^3))).
\end{verbatim}

An example of using such functions is the problem \texttt{ellipse-check-2-fun.tptp}.
More examples are available, in files with names matching the pattern \texttt{*-fun.tptp}.
Functions may refer to one another, but it is the user's responsibility to avoid
recursive definitions, which are \textit{not} checked for termination.

\section{Command line options}

\subsection{Options to specify the EADM}

The default algebraic decision method is Z3 with ``Strategy 1''
\cite{passmore-real}.

To specify a different decision method, use one of the following options:
\begin{description}

\item[\texttt{--z3}]
      Use plain Z3 (instead of the default Strategy 1)

\item[\texttt{-m} or \texttt{--mathematica}]
      Use Mathematica as the external algebraic decision method.

\item[\texttt{--qepcad}]
      Use QEPCAD as the external algebraic decision method.

\item[\texttt{--nsatz\_eadm}]
      Use MetiTarski's built-in decision method (combining ICP with a search for real
      nullstellensatz witnesses) before calling the external one.
\end{description}

One of the options \texttt{--z3}, \texttt{--mathematica} or \texttt{--qepcad} must be given.  A good combination is \texttt{--nsatz\_eadm} \texttt{-m}.
      This is an implementation of Passmore's ICP-enhanced Tiwari calculus \cite[\S6.3.3]{passmore-phd}

Other command-line options control MetiTarski's operation and heuristics. The following
options are the most important. The backtracking and case-splitting settings can
make the difference between success and failure, and unfortunately there is no
obvious way to choose the best ones for a given problem. Most of the numerical
parameters below are only for advanced users.

\subsection{Basic options}

\begin{description}

\item[\texttt{-p} or \texttt{--show proof}]
      A proof (if found) will be produced on the standard output.

\item[\texttt{--time} $\langle float\rangle$]
      limits the processor time used in the proof attempt (default 600 seconds);
      decimals are allowed, e.g., 0.1 for 100 milliseconds

\item[\texttt{--autoInclude}, \texttt{--autoIncludeExtended}, \texttt{--autoIncludeSuperExtended}]
      includes axiom files automatically
      (with extended or extra extended accuracy, respectively)

\item[\texttt{-t} or \texttt{--verbose} $\langle 0\ldots 4\rangle$]
      specifies the degree of verbosity for the proof search
      The default is 1, which displays information about CPU usage:
      a full stop (\texttt{.})\ for every 10 seconds of Metis time and a (\texttt{+}) for every 10 secs of RCF time.
\end{description}

\subsection{More advanced heuristic settings}
* Indicates the default alternative.

\begin{description}
\item[\texttt{--backtracking} \texttt{off} / \texttt{on}*]
       disables/enables backtracking.
\item[-\texttt{-cases off}]
      switches off case splitting.
\item[\texttt{--cases} $m$]
      If backtracking is on, then $m$ is the maximum size of the split stack (default 100);
      if backtracking is off, then $m$ is the maximum number of splits (default 3000).
\item[\texttt{--cases} $m$\texttt{+}$n$]
      $m$ is as above, while $n$ sets the weight factor for non-SOS clauses in tenths.
      Thus a value of $n$ of 10 is neutral (the factor is 1.0 and the weight is unchanged).
\item[\texttt{--proj\_ord} \texttt{off} / \texttt{on}*]
      enables/disables automatic selection of a projection ordering for QEPCAD, which
      can have a great impact on multivariate problems.
\item[\texttt{--maxalg} $\langle n\rangle$]
      sets the maximum size of an algebraic clause to be retained as part of the context
      in QEPCAD calls. A value of 50 or 75 may benefit some problems that have two or
      three variables, while a few problems require at least 500. The default is 100.
\item[\texttt{--rerun} \texttt{off} / \texttt{on}* ]
      controls whether to try again (with maxalg = 1001) after running out of clauses,
      instead of just giving up.
\item[\texttt{--paramodulation} \texttt{off} / \texttt{on}* ]
      disables/enables the paramodulation rule.
\item[\texttt{--maxweight} $\langle w\rangle$]
      sets the maximum weight of a retained clause, default 1000 * (.5 + .5 * SOS\_factor).
      Smaller values save memory but may cause MetiTarski to quit prematurely because
      it has run out of clauses. The maximum weight observed in any proof is 1007.
\item[\texttt{--tptp} $\langle d\rangle$]
      specifies the TPTP installation directory
\item[\texttt{--tstp}]
      generates standard TSTP proofs (no infixes, etc.), for use with TSTP analysis tools
\item[\texttt{--full}]
      includes variable instantiations in proofs
\item[\texttt{-q} or \texttt{--quiet}]
      no output: indicates provability with return value

\end{description}

\section{Perl scripts}

Directory scripts contains Perl scripts that are useful for generating and running
problem sets: runmetit and addaxioms.

MetiTarski can be given a list of problem files on the command line, but the
Perl script \texttt{runmetit.pl} provides more flexibility.  It is especially useful when
many problems are to be attempted, each for a limited time.
A log file summarising the outcomes for all the problems will be stored in a file entitled
\texttt{STATUS-Metit}-\textit{yyyy}-\textit{mm}-\textit{dd}.
Here, we use runmetit to run MetiTarski on a directory of problems.
\begin{flushleft}\ttfamily
runmetit.pl --time 10 --proofs --tptp \$HOME/metit-2.4/tptp
\end{flushleft}

Here is a summary of the main options. Others are documented in the script itself.

\begin{description}

\item[\texttt{--time}] limits the processor time in seconds. It is not especially accurate.

\item[\texttt{--proofs}] produces proofs as new files, in a directory entitled Proofs-Metit-yyyy-mm-dd.

\item[\texttt{--threads} (default 2)] specifies the maximum number of threads to be used in parallel.
  (Too many threads may crash your machine! Your Mathematica licence may forbid
   multiple threads.)

\item[\texttt{--options "$opts$"}] passes the given option string $opts$ to MetiTarski.

\end{description}


The script \texttt{addaxioms.pl} expands \texttt{include} directives at the source level in a set
of problem files, creating a directory of problem files in which each occurrence
of \texttt{include} has been replaced by the corresponding axioms. It can be useful for
debugging, perhaps to eliminate axioms that you believe to be irrelevant in the
hope of obtaining proof. The TPTP base directory is identified, as usual, by the
\texttt{--tptp} option if it is provided and otherwise by the \texttt{TPTP} environment variable.

For example, one way to generate a directory of expanded problems is to visit
the tptp directory and type the command
\begin{verbatim}
addaxioms.pl --in Problems --outdir TestDir --tptp .
\end{verbatim}
%
You can add further axiom files to all problems by naming them in this command.

\section{Input syntax and problem preparation}

Problem syntax uses TPTP format extended with infix notation for arithmetic formulas.
Decimal notation is accepted: a decimal such as \texttt{.23} abbreviates the fraction 23/100.
To express a floating point number, write an expression such as \texttt{1.04e-18} or \texttt{4E5}.
Parsing is done using ML-YACC. Please note some quirks of the grammar:

\begin{enumerate}
\item A \texttt{cnf} line introduces a clause, which \textit{must} be enclosed within a pair of parentheses
   even if it consists of a single literal. Example:
\begin{verbatim}
	cnf(sqrt_squared, axiom, (X < 0 | sqrt(X)*sqrt(X) = X)).
\end{verbatim}

\item The syntax for formulas only allows parentheses around a non-trivial formula.
    In particular, \texttt{(p(x))} and \texttt{((x=y))} are forbidden by the parser. But to allow for computer generated
    files in particular that have such redundant brackets, MetiTarski now has a preprocessor that
    specifically removes them. It is still better to avoid creating such files in the first place.

\item Free variables are forbidden in first-order formulas, that is, \texttt{fof} lines. All variables must be quantified.
\end{enumerate}

\subsection{Interval syntax}

A special syntax for intervals is available.
\begin{verbatim}
t : (a,b)
\end{verbatim}

This formula is logically equivalent to the conjunction $a<t \wedge t<b$. Here, $t$, $a$ and $b$
are all terms. Membership in a closed interval is written

\begin{verbatim}
t : (=a,b=)
\end{verbatim}
%
The open and closed brackets can be mixed.
Thus \texttt{X :\ (=0,1)}
means $0 \leq X \wedge X < 1$.

\subsection{The sine and cosine functions}

The approximations for the sine and cosine functions become extremely inaccurate
once their argument exceeds 6 in absolute value. Proofs concerning those
functions outside of this range will almost certainly fail. For many
applications, a proof can be obtained by generalising the problem to replace
$\sin t$ and $\cos t$ by new variables $X$ and $Y$ subject to the constraint $X^2+Y^2=1$.
Note that a separate pair of variables is necessary for each
different argument, $t$. The Chua problems are available both with and without this transformation.

We provide extended approximations for sin and cos as well,
but the arguments must still be strictly limited. The comments in
the axiom files give specific information concerning accuracy.

\subsection{Miscellaneous remarks}

The natural logarithm is written as ln. In contrast, the common logarithm (base
10) is written as log.  It is defined in the axiom file \texttt{log.ax}.
Note that many problem files with \texttt{log} in their name refer to the natural logarithm.

Unfortunately, the decision procedure is hyper-exponential in the number of
variables, and MetiTarski is unlikely to be useful for problems containing more
than five variables.  A few nine-variable theorems have been proved.

MetiTarski can seldom prove equalities.  Inequalities can only be exact at the point
around which the power series of the relevant function has been expanded, typically
0 or~1.

\section{Interpreting the output}

By default, MetiTarski produces very little output, only a ticker indication
of processor time used. You can suppress even this output, or produce traces
in various levels of detail, using the \texttt{--verbose} option mentioned above.
Detailed traces are not easy to interpret, even by experts.

When it terminates, MetiTarski prints a few statistics, including the processor
time used. The proof is displayed if you requested this. There should also be a
status line summarising the outcome. Naturally you want status Theorem, but
other possibilities are Timeout and GaveUp, the latter meaning that MetiTarski
ran out of clauses to process. Because MetiTarski is incomplete, we cannot
conclude anything from its failure to prove a theorem. This can happen because
you have not included necessary axioms, because the theorem is too difficult to
prove, or because it is simply not amenable to the methods used by MetiTarski.

You can improve MetiTarski's performance by including only necessary axiom files
(for example, don't include upper bounds if they aren't needed, which often is
clear by inspection). Also, if you are trying to solve an engineering problem,
writing decimal numbers to 10 significant figures will create extra work for
MetiTarski to deliver an accuracy that can have no practical use. The necessary
tolerances should be determined by your application, and in most cases you
should not need more than three significant figures. (Note that a standard
resistor is only accurate to 10\%.)

You will naturally want to try MetiTarski on your own problems, after converting them
to MetiTarski syntax and inserting the necessary \texttt{include} directives. Use the
supplied problems as examples. Please share any new problems, whether
MetiTarski can solve them or not.

\section*{Acknowledgements}

Other team members include Behzad Akbarpour, James Bridge and Grant Passmore.
Research supported by the Engineering and Physical Sciences Research Council [grant numbers EP/C013409/1 and EP/I011005/1]. MetiTarski is a modified version of Joe Hurd's Metis prover.

\bibliographystyle{plain}
\bibliography{string,atp,general,isabelle,theory,crossref}


\end{document}
